Most evolutionary algorithms have multiple parameters, and their values drastically influence performance. Due to the often complex interactions between parameters, setting these values for a particular problem (parameter tuning) is a challenging task. This task becomes even more complicated when the optimal parameter values change significantly during a run of the algorithm. Dynamic parameter choices (parameter control) then become necessary. In this work, we propose a lazy but effective solution: choosing all parameter values (where this makes sense) at random from a suitably scaled power-law distribution. To demonstrate the effectiveness of this approach, we conduct a runtime analysis of the $(1+(\lambda,\lambda))$ genetic algorithm with all three parameters chosen in this way. We show that, on the one hand, this algorithm can imitate simple hill-climbers like the $(1+1)$ EA, giving the same asymptotic runtimes on problems such as OneMax, LeadingOnes, or Minimum Spanning Tree. On the other hand, the algorithm is also very efficient on jump functions, where the best static parameters are very different from those necessary to optimize simple problems. We prove performance guarantees that are comparable to, and sometimes even better than, the best known ones for static parameters. We complement our theoretical results with a rigorous empirical study confirming what the asymptotic runtime results suggest.
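As a minimal illustration of the parameter-choice idea above, the following sketch samples an integer parameter value from a power-law distribution on $\{1,\dots,n\}$. The exponent `beta = 2.5` and the inverse-CDF sampling routine are illustrative assumptions, not the paper's actual implementation.

```python
import random

def power_law_sample(n, beta=2.5, rng=random):
    # Sample an integer i in {1, ..., n} with Pr[i] proportional to i^(-beta),
    # via inverse-CDF sampling over the normalized weights.
    weights = [i ** (-beta) for i in range(1, n + 1)]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights, start=1):
        acc += w
        if r <= acc:
            return i
    return n  # numerical fallback
```

Small values are drawn most often, but arbitrarily large values up to $n$ keep non-negligible probability, which is what lets the algorithm behave like a hill-climber most of the time while still making occasional large jumps.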
The $(1+(\lambda,\lambda))$ genetic algorithm is a younger evolutionary algorithm that tries to profit from inferior solutions. Rigorous runtime analyses on unimodal fitness functions have shown that it can indeed be faster than classical evolutionary algorithms, though on these simple problems the gains are only moderate. In this work, we conduct the first runtime analysis of this algorithm on a multimodal problem class, the jump functions benchmark. We show that with the right parameters, the \ollga optimizes any jump function with jump size $2 \le k \le n/4$ in expected time $O(n^{(k+1)/2} e^{O(k)} k^{-k/2})$, which significantly, and already for constant~$k$, outperforms standard mutation-based algorithms with their $\Theta(n^k)$ runtime and standard crossover-based algorithms with their $\tilde{O}(n^{k-1})$ runtime guarantee. For the isolated problem of leaving the local optimum of jump functions, we determine the optimal parameters, which lead to a runtime of $(n/k)^{k/2} e^{\Theta(k)}$. This suggests some general advice on how to set the parameters of the \ollga, which might ease the further use of this algorithm.
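For reference, the jump functions benchmark discussed above has a standard definition in the runtime-analysis literature: a OneMax-like slope, with a fitness valley of width $k$ just below the all-ones optimum. A minimal sketch:

```python
def jump(x, k):
    # Standard Jump_k fitness on a bit string x (list of 0/1).
    # Fitness rises with the number of ones up to n - k, then drops
    # into a "gap" of width k that only the optimum escapes.
    n = len(x)
    ones = sum(x)
    if ones <= n - k or ones == n:
        return k + ones
    return n - ones
```

Mutation-only algorithms must flip all $k$ remaining bits at once to cross the gap, which is where the $\Theta(n^k)$ runtime of standard bit mutation comes from.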
To gain a better theoretical understanding of how evolutionary algorithms (EAs) cope with plateaus of constant fitness, we propose the $n$-dimensional Plateau$_k$ function as a natural benchmark and analyze how different variants of the $(1+1)$ EA optimize it. The Plateau$_k$ function has a plateau of second-best fitness in a ball of radius $k$ around the optimum. As evolutionary algorithm, we regard the $(1+1)$ EA with an arbitrary unbiased mutation operator. Denoting by $\alpha$ the random number of bits flipped in an application of this operator and assuming that $\Pr[\alpha = 1]$ has at least some small sub-constant value, we show the surprising result that for all constants $k \ge 2$, the runtime $T$ follows a distribution close to the geometric one with success probability equal to the probability of flipping between $1$ and $k$ bits divided by the size of the plateau. Consequently, the expected runtime is the inverse of this number, and thus only depends on the probability of flipping between $1$ and $k$ bits, but not on other characteristics of the mutation operator. Our result also implies that the optimal mutation rate for standard bit mutation here is approximately $k/(en)$. Our main analysis tool is a combined analysis of the Markov chains on the search point space and on the Hamming levels space, a method that promises to be useful also for other plateau problems.
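The abstract above does not spell out the Plateau$_k$ function exactly; the sketch below is one plausible formulation, assuming a OneMax-style slope and constant second-best fitness on the Hamming ball of radius $k$ around the all-ones optimum. The exact fitness values are illustrative.

```python
def plateau(x, k):
    # Hypothetical Plateau_k formulation: OneMax-like slope up to
    # n - k ones, then a flat second-best plateau within Hamming
    # distance k of the all-ones optimum, which keeps top fitness.
    n = len(x)
    ones = sum(x)
    if ones == n:
        return n + 1   # unique optimum
    if ones >= n - k:
        return n - k   # constant fitness on the plateau
    return ones        # slope towards the plateau
```

On the flat region, selection gives the algorithm no gradient, so progress reduces to a random walk on the plateau, which is what the Markov-chain analysis in the paper captures.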
Probabilistic Law Discovery (PLD) is a logic-based Machine Learning method, which implements a variant of probabilistic rule learning. In several aspects, PLD is close to Decision Tree/Random Forest methods, but it differs significantly in how relevant rules are defined. The learning procedure of PLD solves the optimization problem related to the search for rules (called probabilistic laws), which have a minimal length and relatively high probability. At inference, ensembles of these rules are used for prediction. Probabilistic laws are human-readable, and PLD-based models are transparent and inherently interpretable. Applications of PLD include classification, clustering, and regression tasks, as well as time series analysis, anomaly detection, and adaptive (robotic) control. In this paper, we outline the main principles of PLD, highlight its benefits and limitations, and provide some application guidelines.
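The abstract does not define how the probability of a candidate rule is computed; a natural stand-in, shown below, is the conditional frequency of the conclusion given the premise over the training data. The function name and data representation are hypothetical, not PLD's actual API.

```python
def rule_probability(data, premise, conclusion):
    # Conditional frequency of "premise -> conclusion" over a list of
    # attribute dicts: P(conclusion | premise) estimated empirically.
    # A hypothetical stand-in for the probability of a "probabilistic law".
    matches = [row for row in data
               if all(row.get(k) == v for k, v in premise.items())]
    if not matches:
        return 0.0
    attr, val = conclusion
    hits = sum(1 for row in matches if row.get(attr) == val)
    return hits / len(matches)
```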
We study the multiclass classification problem where the features come from a mixture of time-homogeneous diffusions. Specifically, the classes are discriminated by their drift functions, while the diffusion coefficient is common to all classes and unknown. In this framework, we build a plug-in classifier which relies on nonparametric estimators of the drift and diffusion functions. We first establish the consistency of our classification procedure under mild assumptions and then provide rates of convergence under different sets of assumptions. Finally, a numerical study supports our theoretical findings.
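To make the nonparametric drift estimation concrete, here is a Nadaraya-Watson-style estimator of the drift $b(x)$ from one discretized path, using kernel-weighted Euler increments. This is a standard construction sketched under simple assumptions, not necessarily the estimator used in the paper.

```python
import math

def nw_drift_estimate(path, dt, x, bandwidth):
    # Nadaraya-Watson-style drift estimator from a discretized path:
    # b(x) ~= sum_i K((X_i - x)/h) * (X_{i+1} - X_i)/dt
    #         / sum_i K((X_i - x)/h), with a Gaussian kernel K.
    num = 0.0
    den = 0.0
    for i in range(len(path) - 1):
        w = math.exp(-0.5 * ((path[i] - x) / bandwidth) ** 2)
        num += w * (path[i + 1] - path[i]) / dt
        den += w
    return num / den if den > 0 else 0.0
```

A plug-in classifier then assigns an observed path to the class whose estimated drift best explains its increments.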
In many real-world scenarios, the absence of external knowledge sources such as Wikipedia restricts question answering systems to relying on latent internal knowledge in limited dialogue data. In addition, humans often seek answers by asking several questions to obtain more comprehensive information. As the dialogue becomes more extensive, machines are challenged to refer to previous conversation rounds to answer questions. In this work, we propose to leverage latent knowledge in existing conversation logs via a neural Retrieval-Reading system, enhanced with a TF-IDF-based text summarizer that refines lengthy conversational history to alleviate the long-context issue. Our experiments show that our Retrieval-Reading system can exploit retrieved background knowledge to generate significantly better answers. The results also indicate that our context summarizer significantly helps both the retriever and the reader by introducing more concise and less noisy contextual information.
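The core idea of a TF-IDF-based extractive summarizer can be sketched as scoring each sentence by the average TF-IDF weight of its tokens and keeping the highest-scoring ones in order. The tokenization and the smoothed IDF below are illustrative assumptions, not the paper's exact summarizer.

```python
import math
from collections import Counter

def tfidf_summarize(sentences, top_k=2):
    # Hypothetical extractive summarizer: score each sentence by the
    # mean TF-IDF weight of its tokens, keep the top_k in original order.
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1

    def score(d):
        tf = Counter(d)
        total = sum(tf[t] / len(d) * math.log((1 + n) / (1 + df[t]))
                    for t in tf)
        return total / max(len(tf), 1)

    ranked = sorted(range(n), key=lambda i: score(docs[i]),
                    reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]
```

Sentences with distinctive (rare-across-history) vocabulary are kept, while repetitive small talk is dropped before the context reaches the retriever and reader.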
Transformer models have achieved superior performance in various natural language processing tasks. However, the quadratic computational cost of the attention mechanism limits its practicality for long sequences. There are existing attention variants that improve the computational efficiency, but they have limited ability to effectively compute global information. In parallel to Transformer models, state space models (SSMs) are tailored for long sequences, but they are not flexible enough to capture complicated local information. We propose SPADE, short for $\underline{\textbf{S}}$tate s$\underline{\textbf{P}}$ace $\underline{\textbf{A}}$ugmente$\underline{\textbf{D}}$ Transform$\underline{\textbf{E}}$r. Specifically, we incorporate an SSM into the bottom layer of SPADE, and we employ efficient local attention methods for the other layers. The SSM supplies global information, which compensates for the limited ability of local attention methods to capture long-range dependencies. Experimental results on the Long Range Arena benchmark and language modeling tasks demonstrate the effectiveness of the proposed method. To further demonstrate the scalability of SPADE, we pre-train large encoder-decoder models and present fine-tuning results on natural language understanding and natural language generation tasks.
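The "local attention" mentioned above typically means each position attends only within a fixed window, turning the quadratic attention cost into a linear one. A minimal sketch of the banded mask behind that idea (window size and representation are illustrative, not SPADE's implementation):

```python
def local_attention_mask(seq_len, window):
    # Banded mask for windowed local attention: position i may attend
    # to position j only when |i - j| <= window (True = attend).
    # Entries outside the band would be set to -inf before softmax.
    return [[abs(i - j) <= window for j in range(seq_len)]
            for i in range(seq_len)]
```

Because no position sees beyond its band, long-range information must come from elsewhere; in SPADE that role is played by the SSM layer at the bottom of the stack.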
Pre-trained language models (PLMs) have advanced the state-of-the-art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data. Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks. However, knowledge bases (KBs) utilized in these studies are usually large-scale and static, in contrast to the small, domain-specific, and modifiable knowledge bases that are prominent in real-world task-oriented dialogue (TOD) systems. In this paper, we showcase the advantages of injecting domain-specific knowledge prior to fine-tuning on TOD tasks. To this end, we utilize lightweight adapters that can be easily integrated with PLMs and serve as a repository for facts learned from different KBs. To measure the efficacy of the proposed knowledge injection methods, we introduce Knowledge Probing using Response Selection (KPRS) -- a probe designed specifically for TOD models. Experiments on KPRS and the response generation task show improvements of knowledge injection with adapters over strong baselines.
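Lightweight adapters of the kind referenced above are commonly bottleneck modules: a down-projection, a nonlinearity, an up-projection, and a residual connection, inserted between frozen PLM layers. The pure-Python sketch below follows that Houlsby-style pattern; dimensions and the zero-initialized up-projection (so the adapter starts as an identity map) are standard choices, not necessarily the paper's exact configuration.

```python
import random

random.seed(0)

def make_adapter(d_model, d_bottleneck):
    # Bottleneck adapter sketch: down-project, ReLU, up-project,
    # residual add. W_up starts at zero, so the adapter is initially
    # a no-op and only changes behavior as it is trained.
    W_down = [[random.gauss(0, 0.02) for _ in range(d_bottleneck)]
              for _ in range(d_model)]
    W_up = [[0.0] * d_model for _ in range(d_bottleneck)]

    def adapter(h):  # h: list of d_model floats (one hidden state)
        z = [max(sum(h[i] * W_down[i][j] for i in range(d_model)), 0.0)
             for j in range(d_bottleneck)]
        up = [sum(z[j] * W_up[j][i] for j in range(d_bottleneck))
              for i in range(d_model)]
        return [h[i] + up[i] for i in range(d_model)]

    return adapter
```

Because only the adapter weights are trained, facts from a modified KB can be re-injected by retraining the small adapter alone, leaving the PLM untouched.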
Creating realistic virtual assets is a time-consuming process: it usually involves an artist designing the object, then spending a lot of effort on tweaking its appearance. Intricate details and certain effects, such as subsurface scattering, elude representation using real-time BRDFs, making it impossible to fully capture the appearance of certain objects. Inspired by the recent progress of neural rendering, we propose an approach for capturing real-world objects in everyday environments faithfully and fast. We use a novel neural representation to reconstruct volumetric effects, such as translucent object parts, and preserve photorealistic object appearance. To support real-time rendering without compromising rendering quality, our model uses a grid of features and a small MLP decoder that is transpiled into efficient shader code with interactive framerates. This leads to a seamless integration of the proposed neural assets with existing mesh environments and objects. Thanks to the use of standard shader code, rendering is portable across many existing hardware and software systems.
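The "grid of features" representation usually means interpolating learned feature vectors stored at grid vertices, then feeding the interpolated vector to a small MLP decoder. A minimal 2D bilinear lookup sketch (the 2D case and list-based grid are simplifying assumptions; the paper's representation is volumetric):

```python
def grid_lookup(grid, x, y):
    # Bilinear interpolation into a 2D feature grid: grid[row][col] is
    # a feature vector (list of floats); (x, y) are continuous
    # coordinates in grid units. Returns the interpolated feature.
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0

    def lerp(a, b, t):
        return [ai + (bi - ai) * t for ai, bi in zip(a, b)]

    top = lerp(grid[y0][x0], grid[y0][x0 + 1], fx)
    bot = lerp(grid[y0 + 1][x0], grid[y0 + 1][x0 + 1], fx)
    return lerp(top, bot, fy)
```

Because both the interpolation and a small MLP are simple arithmetic, the whole decoder can be emitted as shader code, which is what makes real-time rendering feasible.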
In 2016-2017, TUS, the world's first experiment to test the possibility of registering ultra-high energy cosmic rays (UHECRs) by their fluorescent radiation in the night atmosphere of Earth, was carried out. Since 2019, the Russian-Italian fluorescence telescope (FT) Mini-EUSO ("UV Atmosphere") has been operating on the ISS. The stratospheric experiment EUSO-SPB2, which will employ an FT for registering UHECRs, is planned for 2023. We show how a simple convolutional neural network can be effectively used to find track-like events in the variety of data obtained with such instruments.
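The core operation of the convolutional track finder mentioned above is a 2D convolution sliding a small kernel over a pixel map. A minimal valid-mode implementation (the kernel and list-based representation are illustrative; the paper's network architecture is not specified here):

```python
def conv2d(image, kernel):
    # Valid-mode 2D convolution (cross-correlation) over a 2D list:
    # each output cell is the kernel-weighted sum of the patch under it.
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out
```

Stacking such filters with nonlinearities lets the network respond to elongated, track-like patterns of lit pixels while suppressing isolated noise hits.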